Synthetic Generation of Multidimensional Data to Improve Classification Model Validity

Abstract

This paper aims to compare Generative Adversarial Network (GAN) models and feature selection methods for generating synthetic data in order to improve the validity of a classification model. The synthetic data generation technique involves generating new samples from existing data to increase diversity and help the model generalize better. The multidimensional aspect of the data refers to the fact that it can have multiple features or variables that describe it. GAN models have proven to be effective in preserving the statistical properties of the original data. However, data augmentation is crucial to build robust and accurate predictive models. By comparing different GAN models with feature selection methods on multidimensional datasets, this paper aims to determine the best combination to support classification of multidimensional data.

Similar resources

The innovation of a statistical model to estimate dependable rainfall (DR) and develop it for determination and classification of drought and wet years of Iran

Water from precipitation supplies the countless needs of living things, especially humans, and any decline in its quantity or quality directly and negatively affects living organisms. Year-to-year fluctuation is a fundamental and very important characteristic of Iran's annual precipitation, and its harmful effects are reflected in one way or another in every economic, social, and even political and security sphere. Because the amount of water from precipitation is one of the main components of planning ...

Synthetic data generation for classification via uni-modal cluster interpolation

The observations used to classify data from real systems often vary as a result of changing operating conditions (e.g. velocity, load, temperature, etc.). Hence, to create accurate classification algorithms for these systems, observations from a large number of operating conditions must be used in algorithm training. This can be an arduous, expensive, and even dangerous task. Treating an operat...

Multinomial Dirichlet Gaussian Process Model for Classification of Multidimensional Data

We present a probabilistic multinomial Dirichlet classification model for multidimensional data and Gaussian process priors. Here, we have considered an efficient computational method that can be used to obtain the approximate posteriors for latent variables and parameters needed to define the multiclass Gaussian process classification model. We first investigated the process of inducing a posterior...

A Novel Approach to Model Generation for Heterogeneous Data Classification

Ensemble methods such as bagging and boosting have been successfully applied to classification problems. Two important issues associated with an ensemble approach are: how to generate models to construct an ensemble, and how to combine them for classification. In this paper, we focus on the problem of model generation for heterogeneous data classification. If we could partition heterogeneous da...

Synthetic Spotlight SAR Image Generation to Improve Geopositioning Accuracy

Many applications require the ability to use airborne sensor data to perform accurate geopositioning. Synthetic Aperture Radar’s (SAR) ability to image from long distances and in poor weather makes it advantageous for such geopositioning. However, the ability to accurately position using SAR data is reliant on having accurate sensor support data (sensor position, velocity, etc...). This paper e...

Journal

Journal title: Journal of Data and Information Quality

Year: 2023

ISSN: 1936-1963, 1936-1955

DOI: https://doi.org/10.1145/3603715